Search Results: "Steinar H. Gunderson"

20 July 2016

Steinar H. Gunderson: Solskogen 2016 videos

I just published the videos from Solskogen 2016 on YouTube; you can find them all in this playlist. They are basically exactly what was being sent out on the live stream, frame for frame, except that the audio for the live shader compos has been remastered, and of course a lot of dead time has been cut out (the stream was running over several days, but most of that time, it was only showing the information loop from the bigscreen). As far as I can tell, YouTube doesn't really support the variable 50/60 Hz frame rate we've been using; mostly it seems to do some 60 Hz upconversion, which is okay enough, because the rest of your setup most likely isn't free-framerate anyway.

Solskogen is interesting in that we're trying to do a high-quality stream with essentially zero money allocated to it; where something like Debconf can spend 2500 on renting and transporting equipment (granted, for two or three rooms and not our single stream), we're largely dependent on personal equipment as well as borrowing things here and there. (I think we borrowed stuff from more or less ten distinct places.) Furthermore, we're nowhere near the situation of "two cameras, a laptop, perhaps a few microphones"; not only do you expect to run full 1080p60 to the bigscreen and switch between that and information slides for each production, but an Amiga 500 doesn't really have an HDMI port, and a Commodore 64 delivers an infamously broken 50.12 Hz signal that you really need to deal with carefully if you want it to not look like crap. These two factors together lead to a rather eclectic setup; here, visualized beautifully from my ASCII art by ditaa:

[Diagram: Solskogen 2016 A/V setup]

Of course, for me, the really interesting part here is near the end of the chain, with Nageru, my live video mixer, doing the stream mixing and encoding. (There's also Cubemap, the video reflector, but honestly, I never worry about that anymore. Serving 150 simultaneous clients is just not something to write home about these days; the only adjustment I would want to make would probably be some WebSockets support to be able to deal with iOS without having to use a secondary HLS stream.) To make things even more complicated, the live shader compo needs two different inputs (the two coders' laptops) live on the bigscreen, which was done with two video capture cards, text chroma-keyed on top from Chroma, and OBS, because the guy controlling the bigscreen has different preferences from me. I would take his screen in as a dirty feed and then put my own stuff around it, like this:

[Screenshot: Solskogen 2016 shader compo]

(Unfortunately, I forgot to take a screenshot of Nageru itself during this run.)

Solskogen was the first time I'd really used Nageru in production, and despite super-extensive testing, there's always something that can go wrong. And indeed there was: First of all, we discovered that the local Internet line had been reduced from 30/10 to 5/0.5 Mbit/sec (which is, frankly, unusable for streaming video), and after we'd half-way fixed that (we got it to 25/4 or so by prodding the ISP, of which we could reserve about 2 for video; demoscene content is really hard to encode, so I'd prefer a lot more), Nageru started crashing. They weren't even crashes I could make any sense of. Generally it seemed like the NVIDIA drivers were returning GL_OUT_OF_MEMORY on things like creating mipmaps; it's logical that they'd be allocating memory, but we had 6 GB of GPU memory and 16 GB of CPU memory, and lots of it was free.
(The PC we used for encoding was much, much faster than what you need to run Nageru smoothly, so we had plenty of CPU power left over to run x264, although you can of course always want more.) It seemed to be mostly related to zoom transitions, so I generally avoided those and ran that night's compos in a more static fashion. It wasn't until later that night (or morning, if you will) that I actually understood the bug (through the godsend of the NVX_gpu_memory_info extension, which gave me enough information about the GPU memory state to understand that I wasn't leaking GPU memory at all): I had set Nageru to lock all of the memory it used into RAM, so that it would never ever get swapped out and lose frames for that reason. I had set the limit for lockable RAM based on my test setup, with 4 GB of RAM, but this setup had much more RAM, a 1080p60 input (which uses more RAM, of course) and a second camera, none of which I had been able to test with before, since I simply didn't have the hardware available. So I wasn't hitting the available RAM, but I was hitting the amount of RAM that Linux was willing to lock into memory for me, and at that point, it would rather return errors on memory allocations (including the allocations the driver needed to make for its texture memory backings) than violate the "never swap" contract. Once I fixed this (by simply increasing the amount of lockable memory in limits.conf; see the sketch at the end of this post), everything was rock-stable, just like it should be, and I could turn my attention to the actual production.

Often during compos, I don't really need the mixing power of Nageru (it just shows a single input, albeit scaled using high-quality Lanczos3 scaling on the GPU to get it down from 1080p60 to 720p60), but since entries come in at different sound levels (I wanted the stream to conform to EBU R128, which it generally did) and different platforms expect different audio work (e.g., you wouldn't put a compressor on an MP3 track that was already mastered, but we did that on e.g. SID tracks, since they have nearly zero ability to control the overall volume), there was a fair bit of manual audio tweaking during some of the compos.

That, and of course, the live 50/60 Hz switches were a lot of fun: If an Amiga entry was coming up, we'd (1) fade to a camera, (2) fade in an overlay saying we were switching to 50 Hz, so have patience, (3) set the camera as master clock (because the bigscreen's clock is going to go away soon), (4) change the scaler from 60 Hz to 50 Hz (takes two clicks and a bit of waiting), (5) change the scaler input in Nageru from 1080p60 to 1080p50, and then (6) do steps 3, 2, 1 in reverse. Next time, I'll try to make that slightly smoother, especially as the lack of audio during the switch (it comes in on the bigscreen SDI feed) tended to confuse viewers.

So, well, that was a lot of fun, and it certainly validated that you can do a pretty complicated real-life stream with Nageru. I have a long list of small tweaks I want to make, though; nothing beats actual experience when it comes to improving processes. :-)
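For reference, the limits.conf fix mentioned above amounts to something like the following. This is a minimal sketch; the username and the choice of "unlimited" are illustrative (the post doesn't quote the actual values), and the new limit only takes effect on the next login for that user:

# /etc/security/limits.conf (or a drop-in file under /etc/security/limits.d/)
# Raise how much memory the user running Nageru may mlock() into RAM.
streamer  soft  memlock  unlimited
streamer  hard  memlock  unlimited

You can verify the effective limit afterwards with "ulimit -l" in a fresh shell for that user.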

13 July 2016

Steinar H. Gunderson: Cubemap 1.3.0 released

I just released version 1.3.0 of Cubemap, my high-performance video reflector. For a change, both new features come from (indirect) user requests; someone wanted support for raw TS inputs, and it was easy enough to add. And then I heard a rumor that people had found Cubemap useless because "it was logging so much". Namely, if you have a stream that's down, Cubemap will connect to it every 200 ms, and log two lines for every failed connection attempt. Now, why people discard software over ~50 MB/day of logs (more like 50 kB/day after compression) on a broken setup (if you have a stream that's not working, why not just remove it from the config file and reload?) instead of just asking the author is beyond me, but hey, eventually it reached my ears, and after a grand half hour of programming, there's now rate-limiting of logging for failed connection attempts. :-) The new version hasn't hit Debian unstable yet, but I'm sure it will very soon.
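This is not Cubemap's actual implementation (that's C++, and I haven't checked the details), just a sketch of the general idea in Python: let a log line for a given failing input through only every so often, and summarize whatever was suppressed in between.

import time

class RateLimitedLogger:
    """Emit a message for a given key at most once per interval; count the rest."""

    def __init__(self, interval_sec=60.0):
        self.interval = interval_sec
        self.last_emit = {}   # key -> time of the last line we let through
        self.suppressed = {}  # key -> number of lines swallowed since then

    def log(self, key, message):
        now = time.monotonic()
        if now - self.last_emit.get(key, float("-inf")) >= self.interval:
            dropped = self.suppressed.pop(key, 0)
            if dropped:
                message += " (%d similar messages suppressed)" % dropped
            print(message)
            self.last_emit[key] = now
        else:
            self.suppressed[key] = self.suppressed.get(key, 0) + 1

logger = RateLimitedLogger(interval_sec=60.0)
for _ in range(10):
    logger.log("input-down", "connection to upstream failed, retrying in 200 ms")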

12 July 2016

Steinar H. Gunderson: Cisco WLC SNMP password reset

If you have a Cisco wireless controller whose admin password you don't know, and you don't have the right serial cable, you can still reset it over SNMP if you forgot to disable the default read/write community:
snmpset -Os -v 2c -c private 192.168.1.1 1.3.6.1.4.1.14179.2.5.5.1.3.5.97.100.109.105.110 s foobarbaz
Thought you'd like to know. :-P (There are other SNMP-based variants out there that rely on the CISCO-CONFIG-COPY-MIB, but older versions of the WLC software don't support it.)
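For what it's worth, the tail of that OID looks like the target username encoded as a length-prefixed ASCII string (5, then the character codes for "admin"), so if I read it correctly, the OID for a different admin account can be built like this (my own sketch, not from the post):

# Build the OID for an arbitrary username: the base OID, then the length of
# the name, then the ASCII value of each character.
base = "1.3.6.1.4.1.14179.2.5.5.1.3"
user = "admin"
oid = base + "." + ".".join(str(b) for b in [len(user), *user.encode("ascii")])
print(oid)  # 1.3.6.1.4.1.14179.2.5.5.1.3.5.97.100.109.105.110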

26 June 2016

Steinar H. Gunderson: Nageru 1.3.0 released

I've just released version 1.3.0 of Nageru, my live software video mixer. Things have been a bit quiet on the Nageru front recently, for two reasons: First, I've been busy with moving (from Switzerland to Norway) and the associated job change (from Google to MySQL/Oracle). Things are going well, but these kinds of changes tend to take, well, time and energy. Second, the highlight of Nageru 1.3.0 is encoding of H.264 streams meant for end users (using x264), not just the Quick Sync Video streams from earlier versions, which work more as a near-lossless intermediate format meant for transcoding to something else later.

Like with most things video, hitting such features really hard (I've been doing literally weeks of continuous stream testing) tends to expose weaknesses in upstream software. In particular, I wanted x264 speed control, where the quality is tuned up and down live as the content dictates. This is mainly because the content I want to stream this summer (demoscene competitions) varies from the very simple to the downright ridiculously complex (as you can see, YouTube just basically gives up and produces gray blocks). If you have only one static quality setting, you have the choice between one that looks like crap on everything, and one that drops frames like crazy (or, if your encoding software isn't all that, like e.g. using ffmpeg(1) directly, just falls behind and all your clients' streams simply stop) when the tricky stuff comes. There was an unofficial patch for speed control, but it was buggy, not suited to today's hardware, and not kept at all up to date with modern x264 versions. So to get speed control, I had to rework that patch pretty heavily (including making it work inside Nageru directly instead of requiring a patched x264), and then it exposed a bug in x264 proper that would cause corruption when changing between some presets, and I couldn't release 1.3.0 before that fix had at least hit git.

Similarly, debugging this exposed an issue with how I did streaming with ffmpeg and the MP4 mux (which you need to be able to stream H.264 directly to HTML5 <video> without any funny and latency-inducing segmenting business); to know where keyframes started, I needed to flush the mux before each one, but this messes up interleaving, and if frames were ever dropped right in front of a keyframe (which they would be on the most difficult content, even at speed control's fastest presets!), the duration field of the frame would be wrong, causing the timestamps to be wrong and even having pts < dts in some cases. (VLC has to deal with flushing in exactly the same way, and thus would have exactly the same issue, although VLC generally doesn't transcode variable-framerate content so well to begin with, so the heuristics would be more likely to work. Incidentally, I wrote the VLC code for this flushing back in the day, to be able to stream WebM for some Debconf.) I cannot take credit for the ffmpeg/libav fixes (that was all done by Martin Storsjö), but again, Nageru had to wait for the new API they introduce (which simply signals to the application when a keyframe is about to begin, removing the need for flushing) to get into git mainline. Hopefully, both fixes will get into releases soon-ish and from there make their way into stretch.

Apart from that, there's a bunch of fixes as always.
I'm still occasionally (about once every two weeks of streaming or so) hitting what I believe is a bug in NVIDIA's proprietary OpenGL drivers, but it's nearly impossible to debug without some serious help from them, and they haven't been responding to my inquiries. Every two weeks means that you could be hitting it in a weekend's worth of streaming, so it would be nice to get it fixed, but it also means it's really really hard to make a reproducible test case. :-) But the fact that this is currently the worst stability bug (and that you can work around it by using e.g. Intel's drivers) also shows that Nageru is pretty stable these days.

16 May 2016

Steinar H. Gunderson: stretch on ODROID XU4

I recently acquired an ODROID XU4. Despite being 32-bit, it's currently at the upper end of cheap SoC-based devboards; it's based on the Exynos 5422 (which sits in the Samsung Galaxy S5), which means a 2 GHz quad-core Cortex-A15 (plus four slower Cortex-A7 cores, in a big.LITTLE configuration), 2 GB RAM, USB 3.0, gigabit Ethernet, a Mali-T628 GPU and eMMC/SD storage. (My one gripe about the hardware is that you can't put on the case lid while still getting access to the serial console.)

Now, since I didn't want it for HTPC use or something similar (I wanted a server/router I could carry with me), I didn't care much about the included Ubuntu derivative with all sorts of Samsung modifications, so instead, I went on to see if I could run Debian on it. (Spoiler alert: You can't exactly just download debian-installer and run it.) It turns out there are lots of people who make Debian images, but they're still filled with custom stuff here and there. In recent times, people have put in heroic efforts to make unified ARM kernels; servers et al. can now enumerate hardware using ACPI, while SoCs (such as the XU4) have a device tree file (loaded by the bootloader) containing a functional description of what hardware exists and how it's hooked up. And lo and behold, the 4.5.0 armmp kernel from stretch boots and mostly works! Well, except that there's no HDMI output. :-)

There are two goals I'd like to achieve with this exercise: First, it's usually much easier to upgrade things if they are close to mainline. (I wanted support for sch_fq, for instance, which isn't in 3.10, and the vendor kernel is 3.10.) Second, anything that doesn't work in Debian is suddenly exposed pretty harshly, and can have bugs filed and fixed, which benefits not only XU4 users (if nothing else, because the custom distros have to carry less delta), but usually also other boards, as most issues are of a somewhat more generic nature. Yet this ideal seems to puzzle some of the more seasoned people in the ODROID user groups; I guess sometimes it's nice to come in as a naïve new user. :-)

So far, I've filed bugs or feature requests against the kernel (#823552, #824435), U-Boot (#824356), grub (#823955, #824399), and login (#824391), and yes, that includes one for the aforementioned lack of HDMI output. Some of them are already fixed; with some luck, maybe the XU4 can be added next to the other Exynos5 board on the compatibility list for the armmp kernels at some point. :-) You can get the image at http://storage.sesse.net/debian-xu4/. Be sure to read the README and the linked ODROID forum post.

26 April 2016

Steinar H. Gunderson: Full stack

As I'm nearing the point where Nageru, my live video mixer, can produce a stream that is actually suitable for sending directly to clients (without a transcoding layer in the chain), it struck me the other day how much of the chain I've actually had to touch:

In my test setup, the signal comes into a Blackmagic Intensity Shuttle. At some point, I found what I believe is a bug in the card's firmware; I couldn't fix it, but a workaround was applied in the Linux kernel. (I also have some of their PCI cards, in which I haven't found any bugs, but I have found bugs in their drivers.) From there, it goes into bmusb, a driver I wrote myself. bmusb uses libusb-1.0 to drive the USB card from userspace, but for performance and stability reasons, I patched libusb to use the new usbfs zerocopy support in the Linux kernel. (The patch is still pending review.) Said zerocopy support wasn't written by me, but I did the work to clean it up and push it upstream (it's in the 4.6-rc* series). Once safely through bmusb, it goes of course into Nageru, which I wrote myself. Nageru uses Movit for pixel processing, which I also wrote myself. Movit in turn uses OpenGL; I've found bugs in all three major driver implementations, and fixed a Nageru-related one in Mesa (and in the process of debugging that, found bugs in apitrace, a most useful OpenGL debugger). Sound goes through zita-resampler to stretch it ever so gently (in case audio and video clocks are out of sync), which I didn't write, but patched to get SSE support (patch pending upstream).

So now Nageru chews a bit on it, and then encodes the video using x264 (that's the new part in 1.3.0; of course, you need a fast CPU to do that as opposed to using Quick Sync). I didn't write x264, but I had to redo parts of the speedcontrol patch (not part of upstream; awaiting review semi-upstream) because of bugs and outdated timings, and I also found a bug in x264 proper (fixed by upstream, pending inclusion). Muxing is done through ffmpeg, where I actually found multiple bugs in the muxer (some of which are still pending fixes). Once the stream is safely encoded and hopefully reasonably standards-conforming (that took me quite a while), it goes to Cubemap, which I wrote, for reflection to clients. For low-bitrate clients, it takes a detour through VLC to get a re-encode to a lower bitrate on a faster machine; I've found multiple bugs in VLC's streaming support in the past (and also fixed some of them, plus written the code that interacts with Cubemap).

From there it goes to any of several clients, usually a browser. I didn't write any browsers (thank goodness!), but I wrote the client-side JavaScript that picks the closest relay, and the code for sending the stream to a Chromecast. I also found a bug in Chrome for Android (to be fixed in version 50 or 51, although the fix was just about turning on something that was already in the works), and one in Firefox for Linux (fixed by patching GStreamer's MP4 demuxer, although they've since switched away from that to something less crappy). IE/Edge also broke at some point, but unfortunately I don't have a way to report bugs to Microsoft. There's also at least one VLC bug involved on the client side (it starts decoding frames too late if they come in with certain irregular timestamps, which causes them to be dropped), but I want to verify that it still persists even after the muxer is fixed before I dig deep into that.
Moral of the story: If anyone wants to write a multimedia application and says "I'll just use <framework, language or library XYZ>, and I'll get everything for free; I just need to click things together!", they simply don't know what they're talking about and are in for a rude awakening. Multimedia is hard, an amazing amount of things can go wrong, complex systems have subtle bugs, and there is no silver bullet.

6 April 2016

Steinar H. Gunderson: Nageru 1.2.0 released

I've just released version 1.2.0 of Nageru, my live video mixer. The main new feature is support for Blackmagic's PCI (and Thunderbolt) series of cards through their driver (in addition to the preexisting support for their USB3 cards, through my own free driver), but the release is really much more than that. In particular, 1.2.0 has a lot of those small tweaks that take it just to the point where it starts feeling like software I can use and trust myself. Of course, there are still tons of rough edges (and probably also bad bugs I don't know about), but in a sense, it's the first real 1.x release. There's not one single thing I can point to; it's more the sum. To that end, I will be using it at Solskogen this summer to run what's most likely the world's first variable-framerate demoparty stream, with the stream nominally in 720p60 but dropping to 720p50 during the oldschool compos to avoid icky conversions on the way, given that almost all oldschool machines are PAL. (Of course, your player needs to handle it properly to get perfect 50 Hz playback, too :-) Most likely through G-SYNC or similar, unless you actually have a CRT you can set to 50 Hz.) For more details about exactly what's new, see the NEWS file, or simply the git commit log.

31 March 2016

Steinar H. Gunderson: Signal

Signal is a pretty amazing app; it manages to combine great security with great simplicity. (It literally takes two minutes, even for an unskilled user, to set it up.) I looked at the Wikipedia article, and the list of properties the protocol provides is impressive; I had hardly any idea you would even want all of these, but I've tried to decode what they actually mean. (There are more guarantees and features for group chat.) Again, it's really impressive. Modern cryptography at its finest.

My only two concerns are that it's too bound to telephone numbers (you can't have the same account on two devices, for instance; it very closely mimics the SMS/MMS/POTS model in that regard), and that it's too clumsy to verify public keys for the IM part. It can show them as hex or do a two-way QR code scan, but there's no NFC support, and there's no way to read out e.g. a series of plaintext words instead of the fingerprint. (There's no web of trust, but that's probably actually for the better.) I hear WhatsApp is currently integrating the Signal protocol (or might be done already; it's a bit unclear), but for now, my bet is on Signal. Install it now and frustrate the NSA. And get free SMS/MMS to other Signal users (who are growing in surprising numbers) while you're at it. :-)

11 March 2016

Steinar H. Gunderson: Agon and the Candidates tournament

The situation where Agon (the designated organizer of the Chess World Championship, and also of the Candidates tournament, the prequalifier to said WC) is trying to claim exclusive rights to the broadcasting of the moves (not just the video) is turning bizarre. First of all, they have readily acknowledged they have no basis in copyright to do so; chess moves, once played, are facts and cannot be restricted. They try to jump through some hoops with a New York-specific doctrine about "hot news" (even though the Candidates, unlike the World Championship, is played in Moscow), but their main weapon seems to be that they will simply throw out anyone from the hall who tries to report on the moves, and then try to give them only to those who promise not to pass them on. This leads to the previously unheard-of situation where you need to register and accept their terms just to get to watch the games in your browser. You have to wonder what they will be doing about the World Championship, which is broadcast unencrypted on Norwegian television (previous editions also with no geoblock).

Needless to say, this wasn't practically possible to hold together. All the big sites (like Chessdom, ChessBomb and Chess24) had coverage as if nothing had happened. Move sourcing is a bit of a murky business where nobody really wants to say where they get the moves from (although it's pretty clear that for many tournaments, the tournament organizers will simply come to one or more of the big players with a URL they can poll at will, containing the games in the standard PGN format), and this was no exception; ChessBomb went to the unusual move of asking their viewers to download Tor and crowdsource the moves, while Chessdom and Chess24 appeared to do no such thing. In fact, unlike Chessdom and ChessBomb, Chess24 didn't seem to say a thing about the controversy, possibly because they now found themselves on the other side of the fence from Norway Chess 2015, where they themselves had exclusive rights to the PGN in a similar controversy, although it would seem from a tweet that they were perfectly okay with people just re-broadcasting from their site if they paid for a (quite expensive) premium membership, and didn't come up with any similar legal acrobatics to try to scare other sites. However, their ToS were less clear on the issue, and they didn't respond to requests for clarification at the time, so I guess all of this just continues to run on some sort of gentleman's agreement among the bigger players. (ChessBomb also provides PGNs for premium members for the tournaments they serve, but they expressly prohibit rebroadcast. They claim that for the tournaments they host themselves, which is a small minority, they provide free PGNs for all.)

Agon, predictably, sent out angry letters threatening to sue the sites in question, although it's not at all clear to me what exactly they would sue for. Nobody seemed to care, except one entity: TWIC, which normally has live PGNs from most tournaments, announced they would not be broadcasting from the Candidates tournament. This isn't that unexpected, as TWIC (which is pretty much a one-man project anyway) is mainly about archival, publishing weekly dumps of all top-level games played that week. This didn't affect a lot of sites, though, as TWIC's live PGNs are often not what you'd want to base a top-caliber site on (they usually lack clock information, and moves are often delayed by half a minute or so).
I run a hobby chess relay/analysis site myself (mainly focusing on the games of Magnus Carlsen), though, so I've used TWIC a fair bit in the past, and if I were to cover the Candidates tournament (I don't plan to, given Agon's behavior, although I do plan to cover the World Championship itself), I might have been hit by this.

So, that was the background. The strange part started when worldchess.com, Agon's broadcasting site, promptly went down during the first round of the Candidates tournament today; Agon blamed a DDoS, which I'm sure is true, but it's unclear exactly how strong the DDoS was, and whether they did anything at all to deal with it other than simply wait it out. But this led to the crazy situation where the self-declared monopolist was the only big player not broadcasting the tournament in some form. And now, in a truly bizarre move, World Chess is publishing a detailed rebuttal of Agon's arguments, explaining how their position is bad for chess, not legally sound, and also morally wrong. Yes, you read that right; Agon's broadcast site is carrying an op-ed saying Agon is wrong. You at least have to give them credit for not trying to censor their columnist when he says something they don't agree with.

Oh, and if you want those PGNs? I will, at least for the time being, be pushing them out live on http://pgn.sesse.net/. I have not entered into any agreement with Agon, and the PGNs are hosted in Norway, far from any New York-specific doctrines. So feel free to relay from there, although I would of course be happy to know if you do.
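If you do want to relay, the simplest approach is probably just to poll the PGN and push updates onwards. A minimal sketch (my own illustration; the exact file name under pgn.sesse.net is hypothetical, so check the site for what's actually published):

# Poll a live PGN feed and print it whenever it changes.
import time
import urllib.request

URL = "http://pgn.sesse.net/games.pgn"  # hypothetical file name
last = None

while True:
    try:
        with urllib.request.urlopen(URL, timeout=10) as resp:
            pgn = resp.read().decode("utf-8", errors="replace")
    except OSError as err:
        print("fetch failed:", err)
    else:
        if pgn != last:
            print(pgn)
            last = pgn
    time.sleep(5)  # be polite; moves only arrive every so often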

2 March 2016

Steinar H. Gunderson: Nageru FOSDEM talk video

I got tired of waiting for the video of my FOSDEM talk about Nageru, my live video mixer, to come out, so I made an edit myself. (This wouldn't have been possible without getting access to the raw video from the team, of course.) Of course, with maximally bad timing, the video team published their own (auto-)edit of mine and a lot of other videos on the very same day (and also updated their counts to a whopping 151 fully or partially lost videos out of a bit over 500!), so now there are two competing ones. Of course, since I have only one and not 500 videos to care about, I could afford to give it a bit more love; in particular, I spliced in digital versions of the original slides where appropriate, adjusted the audio levels a bit where there are audience questions, added manual transcriptions and so on, so I think it ended up quite a bit better. Nageru itself is also coming along pretty nicely, with a 1.1.0 release that supports DeckLink PCI cards (not just the USB ones) and also the NVIDIA proprietary driver, for much increased oomph. It's naturally a bit quieter now than it was; conferences always tend to generate a flurry of interest, and then you get back to other interests afterwards. You can find the talk on YouTube; I'll also be happy to provide a full-quality master of my edit if FOSDEM or anyone else wants it. Enjoy :-)

25 February 2016

Steinar H. Gunderson: Frankenmachine

[Photo: my desktop machine, from the back]

My video testing machine has now seemingly accumulated graphics cards from all three major vendors. Fascinatingly enough, Debian actually has no problem having Intel, NVIDIA and AMD graphics drivers all installed at once. I can't run more than one at the same time, though; somehow X servers are still bound to this concept of vtys (so you can only run one), and the NVIDIA/AMD drivers crash if you try to run them at the same time. You almost certainly could dedicate each card to a VM (PCI-Express passthrough) and run it that way, but just being able to switch is fine. Now, about that fan noise...

23 February 2016

Steinar H. Gunderson: Multithreaded OpenGL driver quality

Multithreaded OpenGL is tricky, both for application programmers and drivers. Based on some recent experience with developing an application that would like to run on all three major desktop GPU vendors, allow me to present my survey with sample size 1. I'll let you draw your own conclusions: Curiously, somewhat similar to an Intel/Mesa bug I reported back in July, and which has no response yet. Update: The NVIDIA driver is now up to exposing three bugs in my code. One of them even came with a textual error message when running with a debug context (in apitrace).

31 January 2016

Steinar H. Gunderson: Back from FOSDEM

Back safely from FOSDEM; just wanted to write down a few things while they're still fresh. FOSDEM continues to be huge. There are just so many people, and it overflows everywhere into ULB; even the hallways during the talks are packed! I don't have a good solution for this, but I wish I did. Perhaps some rooms could be used as "overflow rooms", i.e., with a video link/stream to them, so that more people can get to watch the talks in the most popular rooms. The talks were of variable quality. I went to some that were great and some that were less than great, and it's really hard to know beforehand from the title/abstract alone; FOSDEM is really a place that goes for breadth. But the main attraction keeps being bumping into people in the hallways; I met a lot of people I knew (and some that I didn't know), which was the main thing for me.

My own talk about Nageru, my live video mixer, went reasonably well; the room wasn't packed (about 75% full) and the live demo had to be run with only one camera (partly because the SDI camera I was supposed to borrow couldn't get to the conference due to unfortunate circumstances, and partly because I had left a command in the demo script to run with only one anyway), but I got a lot of good questions from the audience. The room was rather crummy, though; with no audio amplification, it was really hard to hear in the back (at least in the talks I visited myself in the same room), and half of the projector screen was essentially unreadable due to other people's heads being in the way. The slides (with speaker notes) are out on the home page, and there will be a recording as soon as FOSDEM publishes it. All in all, I'm happy I went; presenting for an unknown audience is always a thrill, especially with the schedule being so tight. Keeps you on your toes.

Lastly, I want to put out a shoutout to the FOSDEM networking team (supported by Cisco, as I understand it). The wireless was near-spotless; I had an issue reaching the Internet for the first five minutes I was at the conference, and then there were ~30 seconds where my laptop chose (or was directed towards) a far-away AP; apart from that, it was super-responsive everywhere, including locations that were far from any auditorium. Doing this with 7000 heavy users is impressive. And NAT64 as the primary ESSID is bold =)

PS: Uber, can you please increase the surge pricing during FOSDEM next year? It's insane to have zero cars available for half an hour, and then only 1.6x surge at most.

29 January 2016

Steinar H. Gunderson: En route to FOSDEM

FOSDEM is almost here! And in an hour or so, I'm leaving for the airport. My talk tomorrow is about Nageru, my live video mixer. HDMI/SDI signals in, stream that doesn't look like crap out. Or, citing the abstract for the talk:
Nageru is an M/E (mixer/effects) video mixer capable of high-quality output on modest hardware. We'll go through the fundamental goals of the project, what we can learn from the outside world, performance challenges in mixing 720p60 video on an ultraportable laptop, and how all of this translates into a design and implementation that differs significantly from existing choices in the free software world.
Saturday 17:00, Open Media devroom (H.2214). Feel free to come and ask difficult questions. :-) (I've heard there's supposed to be a live stream, but there's zero public information on details yet. And while you can still ask difficult questions while watching the stream, it's unlikely that I'll hear them.)

25 January 2016

Steinar H. Gunderson: Chess endgame tablebases

A very short post: This link contains an interesting exposition of the 50-move rule in chess, and what it means for various endings. You can probably stop halfway, though; most of it is only of interest to people deeply into endgame theory. Personally, I think DTZ50, as used by the Syzygy tablebases, is the best tradeoff for computer chess. It always produces the correct result (never throws away a win as a draw, or a draw as a loss), but the actual mates are of suboptimal length and look very strange (e.g., it will happily give away most of its pieces and then play very tricky endgames to mate). Then again, if you ever want to swindle against a non-optimal opponent (i.e., try to make the position as hard as possible to play, to possibly convert e.g. a loss into a draw), you've opened up an entirely new can of worms. :-) Update: Eric P Smith has written a different explanation of the DTM50 metric that's probably easier to understand, although it contains less new information than the other one.

6 January 2016

Steinar H. Gunderson: IPv6 non-alternatives: DJB's article, 13 years later

With the world passing 10% IPv6 penetration over the weekend, we see the same old debates coming up again; people claiming IPv6 will never happen (despite several years now of exponential growth!), and that if they had only designed it differently, it would have been all over by now. In particular, people like to point to an article by D. J. Bernstein from 2002-2003, complete with rants about how Google would never set up "useless IPv6 addresses" (and then they did exactly that in 2007; I was involved). It's difficult to understand exactly what the article proposes, since it's heavy on calling people idiots and light on actual implementation details (as opposed to when DJB has gotten involved in other fields; e.g., thanks to him we now have elliptic curve crypto that doesn't suck, even if the reference implementation was sort of a pain to build), but I will try to go through it nevertheless and show how I cannot find any way it would work well in practice.

One thing first, though: Sorry, guys, the ship has sailed. Whatever genius solution DJB may have thought up that I'm missing, and whatever IPv6's shortcomings (they're certainly there), IPv6 is what we have. By now, you cannot expect anything else to arise and take over the momentum; we will either live with IPv6 or die with IPv4.

So, let's see what DJB says. As far as I can see, his primary call is for a version of IPv6 where the address space is an extension of the IPv4 space. For the sake of discussion, let's call that "IPv4+", although it would share a number of properties with IPv6. In particular, his proposal requires changing the OS and other software on every single end host out there, just as IPv6 does; he readily admits that and outlines how it's done in rough terms (change all structs, change all configuration files, change all databases, change all OS APIs, etc.). From what I can see, he also readily admits that IPv4 and IPv4+ hosts cannot talk to each other, or more precisely, that we cannot start using the extended address space before almost everybody has IPv4+-capable software. (E.g., quote: "Once these software upgrades have been done on practically every Internet computer, we'll have reached the magic moment: people can start relying on public IPv6 addresses as replacements for public IPv4 addresses.")

So, exactly how does the IPv4 address space fit into the IPv4+ address space? The article doesn't really say anything about this, but I can imagine only two strategies: Build the IPv4+ space around the IPv4 space (i.e., the IPv4 space occupies a little corner of the IPv4+ space, similar to how v4-mapped addresses are used within software but not on the wire today, to let applications do unified treatment of IPv4 addresses as a sort of special IPv6 address), or build it as a hierarchical extension.

Let's look at the former first: one IPv4 address gives you one IPv4+ address. Somehow this seems to give you all the disadvantages of IPv4 and all the disadvantages of IPv6. The ISP is not supposed to give you any more IPv4+ addresses (or at least DJB doesn't want to have to contact his ISP for more; he also says that automatic address distribution does not change his argument), so if you have one, you're stuck with one. So you still need NAT. (DJB talks about "proxies", but I guess that the way things evolved, this either actually means NAT, or it refers to the practice of using application-level proxies such as Squid or SOCKS proxies to reach the Internet, which really isn't commonplace anymore, so I'll assume for the sake of discussion it means NAT.)
However, we already do NAT. The IPv4 crunch happened despite ubiquitous NAT everywhere; we're actually pretty empty. So we will need to hand out IPv4+ addresses at the very least to new deployments, and also probably reconfigure every site that wants to expand and is out of IPv4 addresses. ("Site" here could mean any organizational unit, such as if your neighborhood gets too many new subscribers for your ISP's local addressing scheme to have enough addresses for you.)

A much more difficult problem is that we now need to route these addresses on the wire. Ironically, the least clear part of DJB's plan is step 1, saying "we will extend the format of IP packets to allow 16-byte addresses"; how exactly will this happen? For this scheme, I can only assume some sort of IPv4 option that says "the stuff in the dstaddr field is just the start and doesn't make sense as an IPv4 address on its own; here are the remaining 12 bytes to complete the IPv4+ address". But now your routers need to understand that format, so you cannot get away with only upgrading the end hosts; you also need to upgrade every single router out there. (Note that many of these do routing in hardware, so you can't just upgrade the software and call it a day.) And until that's done, you're in exactly the same situation as with IPv4/IPv6 today; it's incompatible. I do believe this option is what DJB is talking about. However, I fail to see exactly how it is much better than the IPv6 we got ourselves into; you still need to upgrade all software on the planet and all routers on the planet. The benefit is supposedly that a company or user that doesn't care can just keep doing nothing, but they do need to care, since they need to upgrade 100% of their stuff to understand IPv4+ before we can even start deploying it alongside IPv4 (in contrast with IPv6, where we now have lots of experience in running production networks). The single benefit is that they won't have to renumber until they need to grow, at which point they need to anyway.

However, let me also discuss the other possible interpretation, namely that of the IPv4+ address space being an extension of IPv4, i.e., if you have 1.2.3.4 in IPv4, you have 1.2.3.4.x.x.x.x or similar in IPv4+. (DJB's article mentions 128-bit addresses and not 64-bit, though; we'll get to that in a moment.) People keep bringing this up, too; it's occasionally been called "BangIP" (probably jokingly, as in this April Fool's joke) due to the similarity with how explicit mail routing worked before SMTP became commonplace. I'll use that name, even though others have been proposed.

The main advantage of BangIP is that you can keep your Internet core routing infrastructure. One way or the other, it will keep seeing IPv4 addresses and IPv4 packets; you need no new peering arrangements, etc. The exact details are unclear, though; I've seen people suggest GRE tunneling, ignoring the problems that has through NAT, and I've seen suggestions of IPv4 options for source/destination addresses, also ignoring that something as innocuous as setting the ECN bits has been known to break middleboxes left and right. But let's assume you can pull that off, because your middlebox will almost certainly need to be the point that decapsulates BangIP anyway and converts it to IPv4 on the inside, presumably with a 10.0.0.0/8 address space so that your internal routing can keep using IPv4 without an IPv4+ forklift upgrade.
(Note that you now lose the supposed security benefit of NAT, by the way, although you could probably encrypt the address.) Of course, your hosts will still need to support IPv4+, and you will need some way of communicating that you are on the inside of the BangIP boundary. And you will need to know what the inside is, so that when you communicate on that side, you send IPv4 and not IPv4+. (For a home network with no routing, you could probably even just do IPv4+ on the inside, although I can imagine complications.)

But like I wrote above, experience has shown us that 32 extra bits isn't enough; one layer of NAT isn't doing it, we need two. You could imagine the inter-block routability of BangIP helping a fair bit here (e.g., a company with too many machines for 10.0.0.0/8 could probably easily get more external IPv4 addresses, each yielding another 10.0.0.0/8 block), but ultimately, the problem is that you chop the Internet into two distinct halves that work very differently. My ISP will probably want to use BangIP for itself, meaning I'm on the outside of the core; how many of those extra bits will they allocate for me? Any at all? Having multiple levels of bang sounds like pain; effectively we're creating a variable-length address. Does anyone ever want that? From experience, when we're creating protocols with variable-length addresses, people just tend to use the maximum length anyway, so why not design it with 128 bits to begin with? (The original IP protocol proposals actually had variable-length addresses, by the way.)

So we can create our "32/96 BangIP", where the first 32 bits are for the existing public Internet, and then every IPv4 address gives you 2^96 addresses to play with. (In a sense, it reminds me of 6to4, which never worked very well and is now thankfully dead.) However, this makes the inside/outside-the-core problem even worse. I now need two very different wire protocols coexisting on the Internet: IPv4+ (which looks like regular IPv4 to the core) for the core, and a sort of IPv4+-for-the-outside (similar to IPv6) outside it. If I build a company network, I need to make sure all of my routers are IPv4+-for-the-outside and talk that, while if I build the Internet core, I need to make sure all of my connections are IPv4, since I have no guarantee that I will be routable on the Internet otherwise. Furthermore, I have a fixed prefix that I cannot really get out of, defined by my IPv4 address(es). This is called "hierarchical routing", and the IPv6 world gave it up relatively early despite it sounding like a great idea at first, because it makes multihoming a complete pain: If I have an address 1.2.3.4 from ISP A and 5.6.7.8 from ISP B, which one do I use as the first 32 bits of my IPv4+ network if I want it routable on the public Internet? You could argue that the solution for me is to get an IPv4 PI netblock (supposedly a /24, since we're not changing the Internet core), but we're already out of those, which is why we started this thing to begin with. Furthermore, if the IPv4/IPv4+ boundary is above my immediate connection to the Internet (say, ISP A doesn't have an IPv4 address, just IPv4+), I'm pretty hosed; I cannot announce an IPv4 netblock in BGP. The fact that the Internet runs on largely the same protocol everywhere is a very nice thing; in contrast, what is described here really would be a mess!

So, well. I honestly don't think it's as easy as just doing "extension" instead of "alternative" when it comes to the address spaces.
We'll just need to deal with the pain and realize that upgrading the equipment and software is the larger part of the job anyway, and we'll need to do that no matter what solution we go with. Congrats on reaching 10%! Now get to work with the remaining 90%.

28 December 2015

Steinar H. Gunderson: The difference between logs and no logs

A USB3 device of mine stopped working one day. In Windows (its native environment; Linux is not officially supported), there would be a "device plugged in" sound and then nothing. Nothing in the event log, no indication there had ever been a device of any kind in Device Manager. In Linux, after 20 seconds or so, this would come up:
[   71.831659] usb usb2-port1: Cannot enable. Maybe the USB cable is bad?
I bought a new and shorter cable. The card started working again.

27 December 2015

Steinar H. Gunderson: Going to FOSDEM 2016

I've ordered my tickets and my hotel room, so it's settled: I'm going to FOSDEM 2016! I'll be giving a talk in the Open Media devroom (H.2214), Saturday 17:00, about my new project, Nageru. (Actually, it's sort of a launch as well, since the source code isn't out yet.) So, what is Nageru, you might ask? Well, it has to do with video. And it's made for 2016, not 1996, so it uses your GPU via Movit, also released at FOSDEM two years ago. For the rest, come see my talk :-)

17 December 2015

Steinar H. Gunderson: sRGB weirdness: Doing the right thing causes a worse result

A while back, I wrote about how you should always do image calculations in linear light, not gamma space; since then, Tom Forsythe has come out with a much better metaphor than I could cough up myself, namely that you should look at sRGB values as compressed values, not as integers. So naturally, when I needed a deinterlacing filter for Movit, my GPU filter library, I wanted to do it in linear light. (In fact, all pixel processing in Movit is in linear light, except in the cases where it's 100% equivalent to do it on the gamma-encoded values and the conversion can be skipped for speed.) After some deliberation, I made an implementation of Martin Weston's three-field deinterlacing filter, known in ffmpeg as w3fdif. I won't discuss deinterlacing in detail here since it's really hard, but I'll note that w3fdif works by applying two filters; low-frequency components are estimated from the current field, and high-frequency components from the previous and next fields. (This makes intuitive sense; you cannot get the HF information from the current field since you don't have the lines you need for that, but you can hope it hasn't changed too much.)

Aha! A filter. Brilliant, that's exactly where linear light matters the most, too. But when implementing it, I found that it sometimes looked weird -- and ffmpeg's implementation (which works directly on the sRGB values, which we already established is wrong) didn't. After lots of tweaking back and forth, I decided to set up a synthetic test to settle this once and for all; I took a static test picture (eliminating everything related to video capture, codecs, frame rates, etc.) and compared to ffmpeg. Of course, deinterlacing is all about movement, but this would do to try to nail things down. So after lots of fruitless debugging, I did a last-ditch test: What if I turned off the gamma conversions? This gave me a huge surprise; indeed, it looked better! I'll provide some upscaled versions; left is the original image, middle is deinterlaced in sRGB space, and right is deinterlaced in linear light:

[Images: original picture; deinterlaced in sRGB space; deinterlaced in linear light]

If that's not dramatic enough for you (trust me, you'll notice it when it's animated as you flicker through the two different fields), here's an even more high-contrast example (same ordering):

[Images: original picture; deinterlaced in sRGB space; deinterlaced in linear light]

I guess it's obvious in retrospect what happens: the HF filter picks up residue from its outer edges, and even if the coefficient is just 0.031 (well, times two; it adds that value from both the previous and next field), 3% of the photons of a fully lit pixel (which is what you get when working in linear light) is actually quite a bit, whereas a 3% gray is only pixel value 8 or so, which is barely visible.

So what am I to make of this? I'm honestly not sure. Maybe it's somehow related to the fact that these filter values were chosen in 1988, when they were relatively unlikely to be applied in linear light (although if they did it with analog circuitry, perhaps they could have been?), and they were tweaked to look good despite doing the wrong thing. Or maybe I need to change my approach here entirely. It always sucks when your fundamental assumptions are challenged, but I think it shows once again that if you notice something funny in your output, you really ought to investigate, because you never know how deep the rabbit hole goes. :-/
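A quick back-of-the-envelope check of that last claim, using the 0.031 coefficient quoted above and the standard sRGB transfer function (my own sketch, not Movit code):

# Compare how visible a 3.1% residue is when the filter runs in linear light
# versus directly on sRGB-encoded values (8-bit, 0-255).

def linear_to_srgb(x):
    # Standard sRGB encoding curve (IEC 61966-2-1).
    if x <= 0.0031308:
        return 12.92 * x
    return 1.055 * x ** (1 / 2.4) - 0.055

coeff = 0.031

# Filtering in linear light: 3.1% of the photons of a fully lit pixel,
# then encoded back to sRGB for display.
print(round(linear_to_srgb(coeff) * 255))  # ~49: a clearly visible gray

# Filtering directly on sRGB code values: 3.1% of the maximum code value.
print(round(coeff * 255))                  # ~8: barely visible, as noted above

In other words, the same small coefficient ends up roughly six times as visible once you (correctly) apply it in linear light.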

10 November 2015

Steinar H. Gunderson: HTTPS-enabling gitweb

If you have an HTTPS-enabling proxy in front of your gitweb, so that gitweb tries to emit <base href="http://..."> (because it doesn't know that the user is actually on HTTPS), here's the Apache configuration variable that tells it otherwise:
SetEnv HTTPS ON
So now git.sesse.net works with HTTPS after Let's Encrypt, without the CSS being broken. Woo. (Well, the clone URL still says http. So, halfway there, at least.)
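For context, the directive goes in the Apache configuration that actually runs the gitweb CGI, i.e., the backend behind the TLS-terminating proxy. A hypothetical minimal setup, just to show placement; the paths, port and hostname are illustrative, and only the SetEnv line comes from the post:

# Backend vhost serving gitweb over plain HTTP behind the HTTPS proxy.
<VirtualHost 127.0.0.1:8080>
    ServerName git.example.net

    # Tell the gitweb CGI that the user-facing scheme is HTTPS, so that it
    # emits <base href="https://..."> even though this hop is plain HTTP.
    SetEnv HTTPS ON

    Alias /gitweb /usr/share/gitweb
    <Directory /usr/share/gitweb>
        Options +FollowSymLinks +ExecCGI
        AddHandler cgi-script .cgi
        Require all granted
    </Directory>
</VirtualHost>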
